ABSTRACT


This paper investigates the use of risk-sensitive filtering for state and
parameter estimation in systems with model uncertainties. Modelling
uncertainties arise from imperfectly known input process and noise
characteristics as well as system model errors such as uncertain or time
varying parameters of the system description. No new convergence results are
given in this paper but simulation examples demonstrate that, in some
situations, risk-sensitive filtering and estimation techniques allow for
system uncertainties better than optimal techniques such as Kalman filtering.


APPROVED  FOR  PUBLIC  RELEASE 


Department  of  Defence 


defence  science  and  technology  organisation 




DSTO-TR-0764 


Published  by 

DSTO  Aeronautical  and  Maritime  Research  Laboratory 
PO  Box  4331, 

Melbourne,  Victoria,  Australia  3001 

Telephone:  (03)  9626  7000 
Facsimile:  (03)  9626  7999 

© Commonwealth of Australia 1999
AR  No.  AR-010-820 
January,  1999 




Risk-Sensitive  Filtering  and  Parameter  Estimation 


EXECUTIVE  SUMMARY 


In control applications, including the control and guidance loops of modern
guided missiles, filtering and system identification are two techniques for
estimating unknown system information. Filtering provides information about
dynamic missile states such as position and velocity, while system
identification provides information about approximately constant quantities
(known as the system model) that describe the missile’s behaviour.

Traditional  filtering  techniques,  such  as  Kalman  filters,  rely  on  assumptions  about  the 
system’s  structure.  It  is  well  known  that  a  Kalman  filter’s  performance  can  be  dramatically 
reduced by errors in the system model on which the Kalman filter design is based. For
example, if a missile is damaged or is operating away from its nominal flight
conditions then the system model for the missile will be incorrect and this may result
in poor performance by the Kalman filter. This paper is concerned with techniques for
relaxing  some  of  the  system  model  assumptions  in  a  way  that  allows  the  performance  of 
filters  to  degrade  gracefully  when  faced  with  system  modelling  errors. 

The  main  technique  investigated  in  this  paper  is  risk-sensitive  filtering.  It  has  been  argued 
in  the  literature  that,  as  the  name  suggests,  risk-sensitive  filters  are  sensitive  to  the  risk 
(or  uncertainty)  in  a  system  model  and  are  better  able  to  allow  for  system  uncertainty 
(or  possible  errors  in  the  system  model)  than  so  called  “optimal”  methods  such  as  the 
Kalman  filter. 

This  paper  concludes  that  risk-sensitive  filtering  offers  advantages  over  more  traditional 
methods  such  as  the  Kalman  filter  when  the  system  is  not  known  with  complete  certainty 
(which is commonly the case). Additionally, this paper suggests that a new system
identification technique known as risk-sensitive parameter estimation may offer
advantages over existing system identification techniques. A more complete
investigation and theoretical
basis  for  risk-sensitive  parameter  estimation  is  required. 

In a defence context, this paper suggests that risk-sensitive filters and
risk-sensitive parameter estimation techniques may improve the robustness of a
missile’s control loops. Improved control loop robustness may enable reasonable
missile performance when a missile is damaged or is operating away from its
nominal flight condition.
1  Introduction


In control applications, filtering and system identification are two techniques
for estimating unknown system information. System identification provides
estimates of model parameters whilst filtering provides estimates of dynamic
quantities such as state variables. A typical problem would involve using system
identification to estimate the system model and then implementing filters based
on the estimated system model. Filtering and system identification can therefore
be seen as complementary techniques that can be used in tandem to achieve a
desired objective.

Both  system  identification  and  filtering  rely  on  assumptions  about  the  system’s  structure. 
In filtering problems, system characteristics are assumed known (i.e. the true system
model is assumed known). While system identification can provide estimates of the
unknown system dynamics, it is not possible to know the system with complete certainty.
Likewise, system identification itself relies on assumptions such as: the true model is
in the restricted class of models over which the identification is performed [22]. This
paper is concerned
with  a  technique  for  relaxing  some  of  the  assumptions  made  in  both  filtering  and  system 
identification. 

The objective of the standard filtering problem is to find the state estimate for
which the expected variance of the estimation error is minimized [1]. This minimum
variance estimation is appealing in control (and other applications) because it can
be seen as minimizing the “energy” in the estimation error. Unfortunately,
techniques which are designed assuming complete system knowledge, such as the
Kalman filter, do not necessarily provide optimal estimates when there is system
uncertainty [23].

Similarly,  when  system  identification  is  used  to  reduce  system  uncertainty,  it  should  be 
remembered  that  many  simplifying  assumptions  underlie  the  identification  process[22]. 
It should be noted that system identification is always performed over a restricted
class of models, e.g. linear systems of fixed order [22]. It is also often assumed that
the true
model  is  a  member  of  the  identification  model  set.  In  many  applications,  the  objective  of 
identification  is  to  estimate  the  system  model  in  the  model  set  closest  to  the  true  system 
in  an  output  error  sense. 

In  control  applications,  the  measure  of  “closeness”  generally  used  is  the  prediction  error 
variance[22].  That  is,  the  estimated  model  is  the  model  whose  outputs  (predictions  of  the 
system  outputs)  are  closest  in  a  variance  sense  to  the  real  system  outputs.  In  this  way, 
system  identification  has  an  analogous  performance  index  to  the  filtering  problem. 

This paper considers an alternative filtering problem known as risk-sensitive
filtering. Although this paper is motivated by control problems, the discussion is
limited to zero input systems and this paper is a preliminary step towards applying
risk-sensitive techniques to filtering for control problems. Risk-sensitive filters
minimize an exponential of the error cost, which penalizes the higher order moments
in the estimation error, that is, moments other than the variance [5]. It has been
argued that system uncertainty appears in these higher moments and hence that
risk-sensitive filters are more “robust” to system uncertainties than minimum
variance estimators [5, 8]. This can be interpreted as meaning that the
risk-sensitive filter can provide estimates that are better, in an error variance
sense, than a Kalman filter, when both are based on the same model assumptions [23].
A complete understanding of the type of uncertainties that lead to this situation has
not yet been reached, but is the subject of continuing research [23].

One application relevant to defence is the design of control loops for missiles.
Kalman filters are commonly used in the control and guidance loops of modern guided
missiles. Risk-sensitive filters and other techniques may improve the robustness of a
missile’s control loops. Improved control loop robustness may enable reasonable
missile performance when a missile is damaged or is operating away from its nominal
flight condition.

This paper also proposes a risk-sensitive parameter estimation problem. When the true
model  is  not  in  the  model  set  we  suggest  that  there  is  important  information  in  the  higher 
error  moments  of  the  prediction  error  (analogous  to  the  risk-sensitive  filtering  problem). 
It  is  suggested  in  this  paper,  without  proof,  that  a  risk-sensitive  parameter  estimation 
approach  allows  for  the  inability  of  the  model  class  to  perfectly  represent  the  true  system. 

Many  of  the  definitions  and  results  given  in  this  paper  are  well  known  in  the  established 
literature  on  risk-sensitive  filtering  and  control.  We  have  tried  to  reference  the  sources  of 
results  as  they  appear  in  this  paper. 

The  paper  is  organized  as  follows:  In  Section  2  the  risk-sensitive  filtering  problem  is 
presented.  The  risk-sensitive  filter  for  linear  systems  is  given  and  an  example  is  presented 
which  compares  the  Kalman  filter  with  the  risk-sensitive  filter  when  the  system  is  not 
known  with  complete  certainty.  This  section  is  a  review  of  existing  results.  In  Section 
3  a  new  research  problem  is  proposed  which  we  call  risk-sensitive  parameter  estimation. 
A parameter estimation example is given that compares the use of Kalman filter state
estimates  with  the  use  of  risk-sensitive  filter  state  estimates.  Finally,  in  Section  4  some 
conclusions  are  presented. 


2  Risk-Sensitive Filtering

2.1  Minimum  Variance  Estimation 

We proceed with the notation used in [6, 8]. Consider the following stochastic,
discrete-time state space system (also known as a Gauss-Markov linear system) defined
on a probability space $(\Omega, \mathcal{F}, P)$:

$$
\begin{aligned}
x_{k+1} &= A x_k + B w_{k+1}, \\
y_k &= C x_k + D v_k, \quad y_k \in \mathbb{R}
\end{aligned}
\tag{2.1}
$$

where $k \in \mathbb{Z}^+$; $x_k, B \in \mathbb{R}^N$; $y_k, D, w_k, v_k \in \mathbb{R}$;
$A \in \mathbb{R}^{N \times N}$ and $C \in \mathbb{R}^{1 \times N}$.

Here, $x_k$ denotes the state of the system, $y_k$ denotes the measurement, and $w_k$
and $v_k$ are the process noise and the measurement noise, respectively. It is assumed
that the noises are independently and identically distributed (iid), zero mean unit
variance Gaussian random variables, i.e. $N[0, 1]$. We denote sequences by bold face
letters subscripted by the index range, for example the sequence
$\{w_0, \ldots, w_k\}$ is denoted by $\mathbf{w}_{0,k}$. It is assumed that
$\mathbf{w}_{0,k}$, $\mathbf{v}_{0,k}$ and $x_0$ are mutually independent. We also
assume that $x_0$ (or an a priori distribution for $x_0$) is given. For simplicity we
have not considered time-varying matrices but these are not excluded by the theory.
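The system (2.1) can be simulated directly. The following is a minimal Python sketch of the scalar case ($N = 1$); the function name `simulate`, its argument names and the seeding scheme are illustrative choices and do not come from the source.

```python
import random


def simulate(a, b, c, d, n_steps, x0=0.0, seed=0):
    """Simulate the scalar case of (2.1); returns (states, measurements).

    a, b, c, d stand for A, B, C, D in (2.1). The names and the seeding
    scheme are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    x = x0
    for _ in range(n_steps):
        xs.append(x)
        # Measurement of the current state: y_k = C x_k + D v_k, v_k ~ N[0, 1]
        ys.append(c * x + d * rng.gauss(0.0, 1.0))
        # State transition: x_{k+1} = A x_k + B w_{k+1}, w_{k+1} ~ N[0, 1]
        x = a * x + b * rng.gauss(0.0, 1.0)
    return xs, ys


# Example call with illustrative parameter values.
xs, ys = simulate(a=0.9, b=1.0, c=1.0, d=1.0, n_steps=1000)
```

Setting `b = 0` and `d = 0` removes both noises, which gives a simple deterministic check of the recursion.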
The conditional mean estimate is defined as follows

$$\hat{x}_k^{CM} = E[x_k \mid \mathbf{y}_{0,k}, x_0], \tag{2.2}$$

where $E[\cdot \mid \cdot]$ denotes conditional expectation on the probability space
$(\Omega, \mathcal{F}, P)$ and $\hat{x}_k^{CM}$ denotes the conditional mean estimate
at time $k$. The conditional mean estimate is equivalent to the minimum variance
estimate given as follows

$$\hat{x}_k^{MV} = \arg\min_{\omega} \left\{ V_k(\omega) = E[(x_k - \omega)'(x_k - \omega) \mid \mathbf{y}_{0,k}, x_0] \right\}, \tag{2.3}$$

where $\arg\min_{\omega} F(\omega)$ is the value of the argument $\omega$ that
minimizes the cost $F(\omega)$ and the prime symbol $'$ denotes the transpose.
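The equivalence between the conditional mean and the minimum variance estimate follows from a standard completion-of-squares argument. The derivation below is a sketch; the symbol $m_k$ is introduced here only for illustration and stands for the conditional mean in (2.2).

```latex
% m_k := E[x_k | y_{0,k}, x_0] is notation introduced for this sketch.
\begin{aligned}
V_k(\omega)
  &= E[(x_k - m_k + m_k - \omega)'(x_k - m_k + m_k - \omega) \mid \mathbf{y}_{0,k}, x_0] \\
  &= E[(x_k - m_k)'(x_k - m_k) \mid \mathbf{y}_{0,k}, x_0]
     + 2\,(m_k - \omega)'\,E[x_k - m_k \mid \mathbf{y}_{0,k}, x_0]
     + (m_k - \omega)'(m_k - \omega) \\
  &= E[(x_k - m_k)'(x_k - m_k) \mid \mathbf{y}_{0,k}, x_0]
     + (m_k - \omega)'(m_k - \omega).
\end{aligned}
```

The cross term vanishes because $E[x_k - m_k \mid \mathbf{y}_{0,k}, x_0] = 0$, and the only remaining term that depends on $\omega$ is non-negative and equals zero at $\omega = m_k$.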

It is well known [1] that for Gauss-Markov linear systems, e.g. (2.1), when the system
is known, the optimal filter for conditional mean estimates is the Kalman filter,
which happens to be a finite-dimensional filter.

When the true model is not known, the Kalman filter implemented assuming a model
estimate $\hat{\lambda}$ gives estimates

$$\hat{x}^{KF}_{k|\hat{\lambda}} = E[x_k \mid \mathbf{y}_{0,k}, x_0, \hat{\lambda}] \tag{2.4}$$

where $\hat{x}^{KF}_{k|\hat{\lambda}}$ denotes the Kalman filter estimate at time $k$
based on the assumed model $\hat{\lambda}$.

When $\hat{\lambda}$ is not the true system, the Kalman filter estimates are generally
not minimum variance estimates [23].

2.1.1  Estimator  Performance  Index 

The  performance  analysis  of  estimators  given  below  follows  the  presentation  given  in  [23]. 

To enable comparison of different filters (or estimators) we introduce a cost
associated with a filter, termed the expected estimator cost, as follows,

$$\bar{W}(\phi) = E_{\Lambda}[W(\phi)] \tag{2.5}$$

where

$$W(\phi) = E[(x_k - \hat{x}^{\phi}_{k|\hat{\lambda}})'(x_k - \hat{x}^{\phi}_{k|\hat{\lambda}}) \mid \mathbf{y}_{0,k}, x_0, \lambda] \tag{2.6}$$

Here $\phi$ is a particular filter and $\hat{x}^{\phi}_{k|\hat{\lambda}}$ denotes the
estimate of $x$ from the filter $\phi$ at time $k$ based on an assumed model
$\hat{\lambda}$. The cost $W(\phi)$ is termed the estimation cost and for large $k$,
if the system is ergodic, converges to the measured estimation cost $W^m(\phi)$

(2.7) 

The symbol $E_{\Lambda}[\cdot]$ denotes expectation on the probability space
$(\Lambda, \mathcal{F}_{\Lambda}, P_{\Lambda})$ where $\Lambda$ is the set (or space)
whose elements $\lambda$ denote the possible dynamics of the system,
$\mathcal{F}_{\Lambda}$ is a $\sigma$-algebra on $\Lambda$, and $P_{\Lambda}$ is a
probability function on $\mathcal{F}_{\Lambda}$ which denotes the probability of
particular dynamics $\lambda \in \Lambda$. This probability space provides a
probabilistic description of the unknown system dynamics and enables comparison of
different filters or estimators. It should be noted that we consider the unknown but
fixed model, $\lambda$, as a random variable.
This  is  a  different  approach  used  in  [23]. 

The minimum cost estimator is defined as

$$\hat{\phi} = \arg\min_{\phi \in \Phi} \bar{W}(\phi) \tag{2.8}$$

where $\hat{\phi}$ denotes the minimum cost estimator and $\Phi$ denotes the set of
possible estimators.

When $\Lambda$ has one member, a Gauss-Markov linear system of the form (2.1), then
the Kalman filter is the minimum cost estimator as defined by (2.8).

The key point of this paper is that for other $\Lambda$, a filter other than the
Kalman filter may be the minimum cost estimator [23]. In [23], examples are given
where the measured estimation cost of the risk-sensitive filter is less than the
measured estimation cost of the Kalman filter.


2.2  Risk-Sensitive  Filtering 

The  following  description  of  risk-sensitive  filtering  comes  from  [8].  The  results  were  first 
established  in  [5].  Motivated  by  the  desire  to  improve  filter  performance  when  system 
uncertainties  exist  we  consider  a  filtering  problem  which  seeks  to  minimize  an  exponential 
of  the  error  performance  index. 

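Analogous to the minimum variance estimate definition, the risk-sensitive filter
estimate, based on an assumed model, is defined as [8],

$$\hat{x}^{RS}_{k|\hat{\lambda}} = \arg\min_{\omega} J_k(\omega) \tag{2.9}$$

where

$$J_k(\omega) = E[\exp(\theta \Psi_{0,k}(\omega)) \mid \mathbf{y}_{0,k}, x_0, \hat{\lambda}], \quad \theta > 0. \tag{2.10}$$

Here,

$$\Psi_{0,k}(\omega) = \hat{\Psi}_{0,k-1} + \tfrac{1}{2}(x_k - \omega)'Q_k(x_k - \omega), \tag{2.11}$$

where

$$\hat{\Psi}_{m,n} = \frac{1}{2}\sum_{k=m}^{n}(x_k - \hat{x}^{RS}_{k|\hat{\lambda}})'Q_k(x_k - \hat{x}^{RS}_{k|\hat{\lambda}}), \tag{2.12}$$

where $\theta > 0$ is the risk sensitivity parameter and $Q_k > 0$ is a weighting
matrix. Here $\hat{\lambda}$ is the assumed model and is not necessarily equal to the
true model $\lambda$.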

The risk parameter can be thought of as describing the amount of uncertainty in the
system description. The larger the value of $\theta$, the more the model,
$\hat{\lambda}$, is believed to be in error. Conversely, as $\theta \to 0$ the model
is believed with more certainty and the risk-sensitive filter approaches the Kalman
filter, see [8] for details.

We  do  not  go  into  details  here  but  finite  dimensional  solutions  to  the  risk-sensitive  problem 
for  linear  systems  have  been  presented  previously  [6,  8].  We  present  the  risk-sensitive  filter 
for  linear  systems  in  a  later  section. 
Remarks

1.  The  risk-sensitive  state  estimate  does  not  have  an  interpretation  as  a  conditional 
mean  estimate. 

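2.  This is perhaps not the obvious risk-sensitive cost but it is shown in [8] that

$$\hat{x}_{k|\hat{\lambda}} = \arg\min_{\omega} E\left[\exp\left\{\theta(x_k - \omega)'Q(x_k - \omega)\right\} \,\middle|\, \mathbf{y}_{0,k}, x_0, \hat{\lambda}\right] \tag{2.13}$$

results in the same solution as the minimum variance (or risk-neutral) problem.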


2.2.1  The  Risk-Sensitive  Cost  does  Penalize  Higher  Moments 


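To see how risk-sensitive estimation penalizes higher moments consider a scalar
system model and set $Q_k = 1$. In this case we can write the cost as

$$
\begin{aligned}
J_k(\omega) &= E\left[\exp\left(\tfrac{1}{2}\theta(\omega - x_k)^2 + \theta\hat{\Psi}_{0,k-1}\right) \,\middle|\, \mathbf{y}_{0,k}, x_0, \hat{\lambda}\right] \\
&= E\left[F_{k-1,\theta} \times \exp\left(\tfrac{1}{2}\theta(\omega - x_k)^2\right) \,\middle|\, \mathbf{y}_{0,k}, x_0, \hat{\lambda}\right], \quad \theta > 0
\end{aligned}
$$

where $F_{k-1,\theta} := \exp(\theta\hat{\Psi}_{0,k-1})$ is a factor independent of
$\omega$. Now writing the second exponential as an infinite series we get

$$J_k(\omega) = E\left[F_{k-1,\theta}\left(1 + \frac{\theta(\omega - x_k)^2}{2} + \frac{\theta^2(\omega - x_k)^4}{2^2\,2!} + \frac{\theta^3(\omega - x_k)^6}{2^3\,3!} + \cdots\right) \,\middle|\, \mathbf{y}_{0,k}, x_0, \hat{\lambda}\right]$$

The terms $(\omega - x_k)^{2p}$ for $p > 1$ are the higher order moments that are not
considered in minimum variance estimation. That is, the risk-sensitive cost penalizes
error contributions from these higher moments whenever $\theta > 0$.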
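The series expansion can be checked numerically in a simplified setting. The Python sketch below drops the factor $F_{k-1,\theta}$ and assumes, purely for illustration, that the error $\omega - x_k$ is zero mean Gaussian with variance `sigma2`; in that case the even moments are $(2p-1)!!\,\sigma^{2p}$ and the exponential cost has the closed form $1/\sqrt{1 - \theta\sigma^2}$ for $\theta\sigma^2 < 1$. The function names are hypothetical.

```python
import math


def series_cost(theta, sigma2, n_terms):
    """Sum the first n_terms of E[exp(theta * e^2 / 2)] term by term.

    Assumes e is zero-mean Gaussian with variance sigma2 (an assumption of
    this sketch), so that E[e^(2p)] = (2p - 1)!! * sigma2^p.
    """
    total = 0.0
    for p in range(n_terms):
        # (2p - 1)!! as a product of odd numbers; the empty product is 1.
        double_factorial = math.prod(range(1, 2 * p, 2))
        moment = double_factorial * sigma2 ** p
        # p-th term of the exponential series: (theta/2)^p / p! * E[e^(2p)]
        total += (theta / 2.0) ** p / math.factorial(p) * moment
    return total


def closed_form_cost(theta, sigma2):
    """Closed form of E[exp(theta * e^2 / 2)], valid for theta * sigma2 < 1."""
    return 1.0 / math.sqrt(1.0 - theta * sigma2)


# The p = 0 and p = 1 terms involve only the variance; the gap between the
# two-term sum and the closed form comes from the moments of order four and up.
two_terms = series_cost(0.5, 0.5, 2)
full = closed_form_cost(0.5, 0.5)
print(two_terms, full)
```

With $\theta = 0$ only the constant term survives, which matches the statement that the higher moments are penalized only when $\theta > 0$.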


2.2.2  Risk-Sensitive  Filtering  for  Linear  Systems 


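Consider the Gauss-Markov linear system given earlier (2.1). The following theorem
holds.

Theorem 1. The optimal risk-sensitive estimate, defined in (2.9), can be expressed

$$\hat{x}^{RS}_{k|\hat{\lambda}} = A\hat{x}^{RS}_{k-1|\hat{\lambda}} + (R_k^{-1} + C'D^{-1}C)^{-1}C'D^{-1}(y_k - CA\hat{x}^{RS}_{k-1|\hat{\lambda}}) \tag{2.14}$$

where $(R_k^{-1} + C'D^{-1}C - \theta Q) > 0$ for all $k$ and $R_k$ satisfies the
following Riccati equation

$$R_{k+1} = B + A(R_k^{-1} + C'D^{-1}C - \theta Q)^{-1}A', \quad R_0 > 0 \tag{2.15}$$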


Proof:  This  was  first  proven  in  [5].  It  is  also  shown  in  [8]. 
Remarks:

1.  The  risk-sensitive  filter  for  Gauss-Markov  linear  systems  is  finite  dimensional. 

2.  Note that the risk-sensitive filter is equivalent to the Kalman filter when $\theta = 0$.

3.  Unlike  the  Kalman  filter,  the  a  priori  and  a  posteriori  estimates  (or  one-step-ahead 
predictions)  for  the  risk-sensitive  filtering  problem  are  not  simply  related  through 
A.  See  [3]  for  details  of  the  predictive  risk-sensitive  filter. 
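The recursion in Theorem 1 and the second remark can be illustrated with a scalar Python sketch. Two assumptions are made here and are not taken from the source: the terms written $B$ and $D^{-1}$ in (2.14) and (2.15) are read as a process noise variance `q` and the inverse of a measurement noise variance `r` (for unit variance noises one natural reading is `q = B**2`, `r = D**2`), and the initial value `r0` is applied to the first measurement passed in. The function names are hypothetical.

```python
def risk_sensitive_filter(ys, a, c, q, r, theta, big_q=1.0, x0=0.0, r0=1.0):
    """Scalar sketch of (2.14)-(2.15); q and r are noise variances (assumed)."""
    x_hat, big_r = x0, r0
    estimates = []
    for y in ys:
        info = 1.0 / big_r + c * c / r  # R_k^{-1} + C' D^{-1} C
        if info - theta * big_q <= 0.0:
            # Condition stated in Theorem 1 is violated for this theta.
            raise ValueError("theta too large for the Riccati recursion")
        pred = a * x_hat
        # Estimate update (2.14); theta does not enter the gain directly.
        x_hat = pred + (c / r) / info * (y - c * pred)
        estimates.append(x_hat)
        # Riccati update (2.15); theta inflates R_{k+1}.
        big_r = q + a * a / (info - theta * big_q)
    return estimates


def kalman_filter(ys, a, c, q, r, x0=0.0, p0=1.0):
    """Standard scalar Kalman filter, used as the theta = 0 reference."""
    x_hat, p = x0, p0
    estimates = []
    for y in ys:
        pred = a * x_hat
        gain = p * c / (c * c * p + r)
        x_hat = pred + gain * (y - c * pred)
        estimates.append(x_hat)
        p = q + a * a * (1.0 - gain * c) * p
    return estimates
```

With `theta=0.0` and matching initial values the two functions return the same estimates up to rounding, which is the equivalence noted in the remark. A positive `theta` enlarges $R_k$ and therefore the gain applied to each new measurement.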

Example  1.  (Risk  Sensitive  Filtering.) 

To demonstrate the possible improvement in state estimation consider the linear
system, given earlier (2.1), with $A, C, B, D = 1$ and $x_0 = 0$. The state sequence
$\mathbf{x}_{0,k}$ is measured indirectly via the observations $\mathbf{y}_{0,k}$.

The parameters $B$ and $C$ are known correctly, but $A$ and $D$ are not known.
Consider the filtering problem where $\Lambda$ is the set of three possible models
with $(A = 0.8, D = 1.2)$, $(A = 0.9, D = 1.2)$ and $(A = 1, D = 1.2)$ respectively.
Assume that the a priori probability of these models is equal.

We compare the performance of the Kalman filter and a risk-sensitive filter
($\theta = 0.5$, $Q = 1$) on the basis of the expected estimator cost, i.e. (2.5), and
measured estimation cost, i.e. (2.7).

Figure 1 shows both the risk-sensitive ($\theta = 0.5$, $Q_k = 1$) and Kalman filter
estimates against the true state value using the assumed model $A = 0.9$, $D = 1.2$.
The risk-sensitive filter has smaller measured estimation cost than the Kalman filter.
That is, $W^m(RS) = 0.002442$ while $W^m(KF) = 0.002480$.
Figure 1 (U): Comparison of risk-sensitive filter and Kalman filter estimates

We compared the performance of the filters in two ways. Firstly, we compared the
measured estimator cost of the two filters for three different model assumptions when
filtering data from one unknown system $(A, B, C, D = 1)$. Secondly, we compared the
estimator cost of the two filters (based on one model assumption
$(A = 0.9, D = 1.2)$) when filtering data generated from the three systems in
$\Lambda$.

The  first  comparison  examines  the  effect  of  varying  model  assumptions  on  the  performance 
of  the  filters  while  the  second  comparison  examines  the  ability  of  one  fixed  filter  of  each 
type  (Kalman  filter  and  risk-sensitive  filter)  to  filter  data  generated  from  a  variety  of 
systems. 


Table 1: The performance of filters for different assumed models when true model is
A, C, B, D = 1.

Model (B, C known)                    W^m(KF)     W^m(RS)
$\hat{A} = 0.8$, $\hat{D} = 1.2$      0.002515    0.002505
$\hat{A} = 0.9$, $\hat{D} = 1.2$      0.002480    0.002442
$\hat{A} = 1.0$, $\hat{D} = 1.2$      0.002618    0.002529

Table 2: The performance of filters on different systems when assumed model is
A = 0.9, B, C = 1, D = 1.2.

Model (B, C, D = 1)      W^m(KF)     W^m(RS)
A = 0.8                  0.002550    0.002531
A = 0.9                  0.002480    0.002442
A = 1.0                  0.004887    0.003923
$\bar{W}(\phi)$          0.003306    0.002965

Table 1 shows the results of the first comparison while Table 2 shows the results of
the second comparison.

The risk-sensitive filter performs better than the Kalman filter in all the
situations presented in the tables. From Table 2, and using the fact that the systems
are ergodic, a value for the expected estimator cost of the filters can be calculated
and for this example the risk-sensitive filter has the lower expected estimator cost.
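The expected estimator cost in the last row of Table 2 can be recovered from the measured costs in the rows above it. The Python sketch below assumes, as stated in the example, that the three models are equally probable a priori; the variable and function names are illustrative.

```python
# Measured estimation costs from Table 2 for true A = 0.8, 0.9 and 1.0.
measured_cost = {
    "KF": [0.002550, 0.002480, 0.004887],
    "RS": [0.002531, 0.002442, 0.003923],
}
# Equal a priori probability for the three models in the model set.
prior = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]


def expected_cost(costs, probs):
    """Probability-weighted average of per-model measured costs."""
    return sum(p * w for p, w in zip(probs, costs))


for name, costs in measured_cost.items():
    print(name, round(expected_cost(costs, prior), 6))
# prints:
# KF 0.003306
# RS 0.002965
```

The averages agree with the tabulated values 0.003306 and 0.002965.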


Risk-Sensitive Filtering Solutions Summary

Finite dimensional solutions for this problem have been found in particular
situations including: linear systems [5, 8], bilinear systems [7] and the zero process
noise case ($w_k = 0$) [7]. It has also been shown that finite-dimensional filters
exist for a class of discrete-time nonlinear systems [10]. Finite dimensional
solutions can be obtained for more general nonlinear systems by using a generalized
risk-sensitive cost index which is chosen to absorb the contribution from the
nonlinear terms [9]. Solutions for the corresponding continuous-time problem are also
available for linear systems [11].

The foundations for the risk-sensitive problem were introduced in the following
papers that focus on the risk-sensitive control problem [15, 18, 19], see the remark
below. Applications for the technique are described in [2, 13]. Recently, a book on
risk-sensitive control has been published [20]. Later work on the risk-sensitive
problem can be found in [5, 16, 21].


Remark:

1.  The risk-sensitive control problem is defined as minimizing the risk-sensitive
control cost:

$$J(u) = E\left[\exp\left(\theta\left(\sum_{k=1}^{T-1}(x_k'Q_kx_k + u_k'R_ku_k) + \cdots\right)\right)\right],$$

where $u = \{u_1, \ldots, u_{T-1}\}$ is a control sequence, $\theta > 0$ is the risk
parameter and $T$ is the input length. It is generally assumed that $Q_k > 0$ and
$R_k > 0$ for all $k$.


3  Risk-Sensitive Parameter Estimation

In  this  section  we  discuss  parameter  estimators  and  propose  a  risk-sensitive  parameter 
estimation  problem.  Before  introducing  the  parameter  estimation  problem  we  will  discuss 
the sources of model uncertainty that affect parameter estimation.


3.1  Model  Uncertainty 

Parameter  estimation  or  system  identification  can  be  viewed  as  a  technique  to  allow  for 
model uncertainty. In the broadest sense, the objective of system identification is to find
the  system  that  created  the  observed  data.  However,  in  practice,  the  class  of  models  over 
which  the  search  is  performed  needs  to  be  restricted  for  computational  and  complexity 
reasons.  Assumptions  about  the  underlying  system  need  to  be  made.  The  objective  of 
system  identification  then  becomes  to  find  the  model  within  the  model  class  that  best 
describes  the  observed  data.  To  enable  searching,  model  classes  are  parameterized  and 
the  system  identification  problem  becomes  a  parameter  estimation  problem. 


3.2  Adaptive  Estimation 

The  need  to  use  adaptive  estimation  arises  in  situations  where  all  the  quantities  needed 
to  estimate  a  parameter  are  not  directly  available,  but  the  required  quantities  themselves 
can  be  estimated.  For  example,  consider  again  the  linear  system, 

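$$
\begin{aligned}
x_{k+1} &= A x_k + B w_{k+1}, \quad x_0 \in \mathbb{R}^N \\
y_k &= C x_k + D v_k, \quad y_k \in \mathbb{R}
\end{aligned}
\tag{3.1}
$$

where $k \in \mathbb{Z}^+$; $x_k, B \in \mathbb{R}^N$; $y_k, D, v_k$ and
$w_k \in \mathbb{R}$; $A \in \mathbb{R}^{N \times N}$ and
$C \in \mathbb{R}^{1 \times N}$. Also, the sequences $\mathbf{w}_{0,k}$ and
$\mathbf{v}_{0,k}$ are sequences of iid, zero mean, unit variance Gaussian random
variables. It is assumed that $\mathbf{w}_{0,k}$, $\mathbf{v}_{0,k}$ and $x_0$ are
mutually independent random variables. Also, it is assumed that $x_0$ is given. Here,
$x_k$ denotes the state of the system which is observed via the observations, $y_k$.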
Assume $A$, $B$ and $D$ are known and that we are interested in estimating $C$.

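The recursive least squares algorithm for estimating $C$ requires that $x_k$ is known
(which in this problem it is not). However, $x_k$ can be estimated via a Kalman
filter. If the estimates from a Kalman filter are substituted into the least squares
algorithm in lieu of $x_k$ then we have an adaptive estimation algorithm known as
Extended Least Squares (ELS). That is,

$$
\begin{aligned}
\hat{C}_{k+1} &= \hat{C}_k + (\hat{x}^{KF}_{k|\hat{\lambda}})' P_k \,(y_k - \hat{C}_k \hat{x}^{KF}_{k|\hat{\lambda}}), \\
P_k^{-1} &= P_{k-1}^{-1} + \hat{x}^{KF}_{k|\hat{\lambda}} (\hat{x}^{KF}_{k|\hat{\lambda}})'
\end{aligned}
\tag{3.2}
$$

where $\hat{x}^{KF}_{k|\hat{\lambda}}$ are Kalman filter estimates.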
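A recursion of this kind can be sketched in Python for the scalar case. The code below is the textbook recursive least squares update with a scalar regressor; in an ELS setting the `regressors` argument would hold filter estimates of the state rather than the true state. The function name, the default initial values `c0` and `p0`, and the exact indexing are assumptions of this sketch.

```python
def rls_scalar(regressors, ys, c0=0.0, p0=1e6):
    """Scalar recursive least squares for y_k = C * x_k + noise.

    regressors: the x_k values, or filter estimates of them for ELS.
    Returns the sequence of estimates of C after each sample.
    """
    c_hat, p = c0, p0
    history = []
    for x, y in zip(regressors, ys):
        # Information update: P_k^{-1} = P_{k-1}^{-1} + x_k^2
        p = 1.0 / (1.0 / p + x * x)
        # Parameter update driven by the current output prediction error
        c_hat = c_hat + p * x * (y - c_hat * x)
        history.append(c_hat)
    return history


# Noise-free illustration: y = 2 x, so the estimate should approach 2.
estimates = rls_scalar([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(estimates[-1])
```

A large `p0` expresses low confidence in the initial guess `c0`, so the first few samples dominate the estimate.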

In general, convergence results for ELS algorithms cannot be established; however, there
are  many  adaptive  estimation  algorithms  for  which  strong  convergence  results  have  been 
established. 

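Consider again the linear system (3.1). Assuming that $B$ and $D$ are known, it is
possible to estimate $A$ and $C$ as follows:

$$
\begin{aligned}
\hat{A}_k &= \hat{J}_k \hat{O}_k^{-1}, \quad \hat{A}_0 \in \mathbb{R}^{N \times N} \\
\hat{C}_k &= \hat{T}_k \hat{O}_k^{-1}, \quad \hat{C}_0 \in \mathbb{R}^{1 \times N}
\end{aligned}
\tag{3.3}
$$

when $\hat{O}_k^{-1}$ exists, where $\hat{A}_0$ and $\hat{C}_0$ are initial guesses
for the parameters and

$$
\begin{aligned}
\hat{J}_k &= E[J_k \mid \mathbf{y}_{0,k}, x_0, \hat{A}_{0,k-1}, \hat{C}_{0,k-1}], \\
\hat{O}_k &= E[O_k \mid \mathbf{y}_{0,k}, x_0, \hat{A}_{0,k-1}, \hat{C}_{0,k-1}] \text{ and} \\
\hat{T}_k &= E[T_k \mid \mathbf{y}_{0,k}, x_0, \hat{A}_{0,k-1}, \hat{C}_{0,k-1}].
\end{aligned}
\tag{3.4}
$$

Here,

$$J_k := \sum_{\ell=1}^{k} x_\ell x_{\ell-1}', \quad O_k := \sum_{\ell=1}^{k} x_{\ell-1} x_{\ell-1}' \quad \text{and} \quad T_k := \sum_{\ell=1}^{k} y_\ell x_\ell'. \tag{3.5}$$

Note the notation $\hat{A}_{0,k-1}$ denotes the sequence
$\{\hat{A}_0, \hat{A}_1, \ldots, \hat{A}_{k-1}\}$. Filters for $J_k$, $O_k$ and $T_k$
are given in [12].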

It has been shown in [14] that if the output data was generated by (3.1) and the model
order, $N$, is known then the estimates $\hat{A}_k$ and $\hat{C}_k$ almost surely
converge to the true $A$ and $C$ model parameters.
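When the state sequence is available, ratio estimates of this kind reduce to ordinary least squares. The scalar Python sketch below forms the sums directly from a measured state sequence instead of from conditional expectations. The function names are hypothetical, and the estimate of $C$ uses the sum of $x_\ell^2$ as its denominator, which is an illustrative choice rather than something taken from the source.

```python
def least_squares_a(xs):
    """Least-squares estimate of A in x_l = A x_{l-1} + noise (scalar)."""
    j = sum(xs[l] * xs[l - 1] for l in range(1, len(xs)))  # sum of x_l x_{l-1}
    o = sum(xs[l - 1] ** 2 for l in range(1, len(xs)))     # sum of x_{l-1}^2
    return j / o


def least_squares_c(xs, ys):
    """Least-squares estimate of C in y_l = C x_l + noise (scalar)."""
    t = sum(y * x for x, y in zip(xs, ys))                 # sum of y_l x_l
    o = sum(x * x for x in xs)                             # sum of x_l^2
    return t / o


# Noise-free illustration: the estimates recover A = 0.9 and C = 2
# up to floating-point rounding.
xs = [0.9 ** k for k in range(20)]
ys = [2.0 * x for x in xs]
print(least_squares_a(xs), least_squares_c(xs, ys))
```

In the adaptive algorithm the sums are not available and are replaced by their filtered estimates, which is where the choice of filter enters.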


3.3  Risk-Sensitive  Parameter  Estimation 


There are two situations in which a risk-sensitive approach may be appropriate:
firstly, when the true model, $\lambda$, is not in the model set, $\Lambda$, and
secondly, when starting from poor initializations $\hat{A}_0$. The adaptive estimation
algorithm (3.3), (3.4) assumes that the true system model is in the model set. If the
model is not in the model set and the state sequence is not measured then estimation
may not be optimal in a prediction error sense.

Even if the true model is in the model set, convergence close to the true model may
be slow from poor initial guesses. It has been suggested from simulation evidence that
a risk-sensitive approach to estimating the quantities $J_k$, $O_k$ and $T_k$ may
improve convergence from poor initializations [17]. These two situations motivate an
investigation of risk-sensitive parameter estimation.


3.3.1  Risk-Sensitive Adaptive Estimation

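In this subsection we propose, without analysis, a risk-sensitive estimation
algorithm. Consider again the linear system (2.1).

Assuming that $B$ and $D$ are known, it is possible to estimate $A$ and $C$ as follows:

$$
\begin{aligned}
\hat{A}_k^{RS} &= \hat{J}_k^{RS} (\hat{O}_k^{RS})^{-1} \\
\hat{C}_k^{RS} &= \hat{T}_k^{RS} (\hat{O}_k^{RS})^{-1}
\end{aligned}
\tag{3.6}
$$

when $(\hat{O}_k^{RS})^{-1}$ exists, where $\hat{A}_0^{RS}$ and $\hat{C}_0^{RS}$ are
initial guesses for the parameters and $\hat{J}_k^{RS}$, $\hat{O}_k^{RS}$ and
$\hat{T}_k^{RS}$ are risk-sensitive estimates for the quantities $J_k$, $O_k$ and
$T_k$ respectively.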

The  following  example  examines  the  use  of  risk-sensitive  filter  estimates  in  a  parameter 
estimation  problem. 

Example 2. (Using Risk-Sensitive Filter Estimates for Parameter Estimation.)

Consider the following linear system

$$
\begin{aligned}
x_{k+1} &= A x_k + B w_{k+1}, \quad x_0 \in \mathbb{R} \\
y_k &= C x_k + D v_k, \quad y_k \in \mathbb{R}
\end{aligned}
$$

where $A = 0.9$, $C = 1$; $B, D = 0.1$; $x_0 = 0$ and $w_k, v_k$ are iid, zero mean
unit variance Gaussian random variables. Here, the state sequence $\mathbf{x}_{0,k}$
is measured indirectly via the observations $\mathbf{y}_{0,k}$ and we are interested
in estimation of $A$.

The true model is not known and the following system parameters are assumed:
$B = 0.1$, $C = 0.6$ and $D = 0.1$. Our initial guess for $A$ is $\hat{A}^0 = 0.6$.
Here, estimation of $A$ is performed over the model set ($B = 0.1$, $C = 0.6$ and
$D = 0.1$) which does not contain the true system.

If the state sequence $\mathbf{x}_{0,T}$ was measured then the least squares estimate
of $A$ would be

$$\hat{A} = \frac{\sum_{k=1}^{T} x_k x_{k-1}}{\sum_{k=1}^{T} x_{k-1}^2}$$

where $T$ is the number of data points.
However, when the state sequence is not measured then a multi-pass missing data
approach [14, 24] can be used. Here filter estimates of the state are used in lieu of
the true state and $A$ is estimated on pass $i$ as follows:

$$\hat{A}^{i} = \frac{\sum_{k=1}^{T} \hat{x}_{k|\hat{\lambda}} \hat{x}_{k-1|\hat{\lambda}}}{\sum_{k=1}^{T} \hat{x}_{k-1|\hat{\lambda}}^2}$$

where $\hat{x}_{k|\hat{\lambda}}$ is an estimate of the state at time $k$ based on
model assumptions ($B = 0.1$, $C = 0.6$ and $D = 0.1$) and $\hat{A}^{i-1}$, either
from the Kalman filter or from a risk-sensitive filter. Passes through the data are
performed until $\hat{A}^{i}$ converges to some value.
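The multi-pass procedure can be sketched in Python for the Kalman filter case. This is a minimal illustration under stated assumptions and is not claimed to reproduce the figures reported in this example: the noise variances passed to the filter are taken as `q = B**2` and `r = D**2`, the filter's initial variance `p0` is arbitrary, and all function names are hypothetical. A risk-sensitive filter could be substituted for `kalman_estimates` in the same loop.

```python
import random


def simulate(a, b, c, d, n, seed=1):
    """Generate states and measurements from the scalar system, x_0 = 0."""
    rng = random.Random(seed)
    x, xs, ys = 0.0, [], []
    for _ in range(n):
        xs.append(x)
        ys.append(c * x + d * rng.gauss(0.0, 1.0))
        x = a * x + b * rng.gauss(0.0, 1.0)
    return xs, ys


def kalman_estimates(ys, a, c, q, r, p0=1.0):
    """Scalar Kalman filter state estimates under the assumed model."""
    x_hat, p, out = 0.0, p0, []
    for y in ys:
        pred = a * x_hat
        gain = p * c / (c * c * p + r)
        x_hat = pred + gain * (y - c * pred)
        out.append(x_hat)
        p = q + a * a * (1.0 - gain * c) * p
    return out


def multipass_estimate_a(ys, a0, c, q, r, passes=10):
    """Re-filter the data with the current estimate of A, then re-estimate A."""
    a_hat = a0
    for _ in range(passes):
        xh = kalman_estimates(ys, a_hat, c, q, r)
        # Least-squares ratio with state estimates in lieu of the true state
        num = sum(xh[k] * xh[k - 1] for k in range(1, len(xh)))
        den = sum(xh[k - 1] ** 2 for k in range(1, len(xh)))
        a_hat = num / den
    return a_hat


# Data from the true system (A = 0.9, C = 1, B = D = 0.1); the estimator
# assumes C = 0.6 and starts from the initial guess 0.6 for A.
_, ys = simulate(a=0.9, b=0.1, c=1.0, d=0.1, n=1000)
a_hat = multipass_estimate_a(ys, a0=0.6, c=0.6, q=0.01, r=0.01)
print(a_hat)
```

Each pass uses only the measurements and the assumed model, so the mismatch in $C$ propagates into the state estimates and from there into the estimate of $A$.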

We compare parameter estimation using Kalman filter estimates
$\hat{x}^{KF}_{k|\hat{\lambda}}$ with estimation using risk-sensitive filter estimates
$\hat{x}^{RS}_{k|\hat{\lambda}}$.

A data set of 1000 points was generated with the above parameter values. First,
Kalman filter estimates were used and after 10 passes $A$ was estimated as 0.9474.
Then risk-sensitive filter estimates ($\theta = 35$, $Q = 1$) were used and after 10
passes $A$ was estimated as 0.9049. This corresponds to an improvement in model
performance, as measured by filtered output error (that is,
$E[(y_k - \hat{y}_k)^2 \mid \mathbf{y}_{0,k}, x_0] \approx \frac{1}{T}\sum_{k=1}^{T}(y_k - \hat{y}_k)^2$),
from 0.002964 for the Kalman filter estimate to 0.001879 for the risk-sensitive filter
estimate.

Convergence to these values occurred for a range of choices for $\hat{A}^0$. Similar
improvements in estimation of $A$ occur using risk-sensitive filters if the assumed
model had $C = 0.8$ or $C = 0.9$.


□ 


Remarks 

1.  The parameter estimation problem and the approach presented in the above example
are admittedly contrived and unlikely to occur in practice. Estimation of both A and
C  using  standard  techniques  would  be  an  obvious  approach  and  would  result  in 
a  better  model  estimate.  However,  the  success  of  the  risk-sensitive  approach  in 
this  artificial  problem  motivates  investigation  of  risk-sensitive  approaches  in  more 
complicated  problems. 

2.  The more usual measure of model performance is the prediction error but the
risk-sensitive filter shown in this paper cannot be used to generate predictions (see
the earlier remark and see [3] for the risk-sensitive predictor). Hence, for convenience
the  filtered  output  error  has  been  used  for  comparison  instead.  There  is  a  similar 
improvement  in  the  prediction  error  of  the  risk-sensitive  predictor  over  the  Kalman 
filter  predictor  when  they  are  based  on  the  models  estimated  in  this  example. 

3.  The missing data approach used in this problem can be considered an example of
an adaptive estimator of the form (3.6). Convergence results or properties have not
yet been established for the presented risk-sensitive estimation algorithm.
4  Conclusion

Unknown  model  dynamics  can  make  the  control  problem  difficult.  Filtering  and  system 
identification  are  two  techniques  used  to  handle  system  uncertainties. 

This  paper  repeated  the  known  risk-sensitive  filtering  problem  and  the  known  solution  for 
Gauss-Markov  linear  systems.  An  example  was  presented  which  compared  the  Kalman 
filter  with  a  risk-sensitive  filter  in  a  situation  where  the  system  parameters  were  not  known 
completely. 

The  key  contribution  of  this  paper  is  the  proposal  of  the  risk-sensitive  parameter  estimation 
problem. An example was presented which demonstrates a possible application of a
risk-sensitive approach to the parameter estimation problem. In this example, it was
shown that a better model (in an output error sense) could be found by using
risk-sensitive state estimates than by using Kalman filter state estimates. This
suggests that a more theoretical and complete investigation of risk-sensitive
parameter estimation may be worthwhile.